home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Shareware Grab Bag
/
Shareware Grab Bag.iso
/
007
/
searches.arc
/
SEARCHES.DOC
< prev
next >
Wrap
Text File
|
1987-10-27
|
4KB
|
77 lines
{ SEARCHES -- A unit for rapidly searching a buffer for a string.
Version 1.00 - 10/26/1987 - Initial release
Scott Bussinger
Professional Practice Systems
110 South 131st Street
Tacoma, WA 98444
(206)531-8944
Compuserve 72247,2671
This UNIT for Turbo Pascal version 4.0 provides contains high speed
routines for searching buffers for strings. To include this unit in your
program add SEARCHES to the USES clause in your main program.
The unit actually has two routines which do the same thing in different
ways. The first is BlockPos which takes three parameters; a buffer
containing the data be searched, the size of the buffer and the string to be
searched for. This routine is written in assembly language with inline code
and is very fast since it takes advantage of special comparison instructions
in the 8086 instruction set. Note the the buffer is assumed to be based from
a lower index of 1. The result is the offset of the match within the buffer.
A failure to match returns a 0. This routine was originally written by Randy
Forgaard for use with Turbo Pascal 3.0.
The second search routine is based on the Boyer-Moore search algorithm and
is coded strictly in Pascal. It is MUCH, MUCH slower than BlockPos and is
included only because I felt like converting it and it doesn't hurt anything
to include it in the unit. Actually, the Boyer-Moore algorithm is quite a
good search algorithm for some cases but doesn't do very well here because
BlockPos is written in assembly language and BoyerMoore isn't. I don't expect
anyone will use this routine, but for those who want to, there are two steps
to using the BoyerMoore routines. The first is to process the search string
into a special form for the actual search routine itself. This procedure is
called MakeBoyerTable and converts a string into a special record type called
a BoyerTable. This need only be done once for a given search string. The
second procedure is the search procedure itself and is called BoyerMoore. For
parameters, it takes the buffer containing data to be searched, the size of
that buffer, the starting location in the buffer (in case you want to continue
a previous search from where you left off), and the BoyerTable you created
using MakeBoyerTable. Again, the Buffer is assumed to be based from 1 and the
result is the offset where the match was found (0 indicating not found).
These routines are based on some routines written by Van Hall for Turbo Pascal
3.0 and I've included his original description of the Boyer-Moore algorithm in
a separate file called BOYER.DOC for those who are interested.
Note that in general, the buffer can contain any data and the search string
need not be text but rather any data in a string form. For example, to search
for a CR/LF sequence use #13#10 as the search string.
You can compile this file to create a test program using these routines. It
allows you to enter a search string and looks in this documentation file for
the chosen string. }
program Test;
uses Searches;
var Buffer: array[1..maxint] of char;
Size: word;
Fyle: file;
Str: string;
Table: BoyerTable;
begin
assign(Fyle,'SEARCHES.DOC');
reset(Fyle,1);
blockread(Fyle,Buffer,maxint,Size);
close(Fyle);
repeat
write('String to search for:');
readln(Str);
writeln(' BlockPos = ',BlockPos(Buffer,Size,Str));
MakeBoyerTable(Str,Table);
writeln(' BoyerMoore = ',BoyerMoore(Buffer,Size,1,Table))
until Str = ''
end.